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Abstract 

Background: Graph-based analysis of f MR I data has recently emerged as a promising approach to study brain 
networks. Based on the assessment of synchronous fMRI activity at separate brain sites, functional connectivity graphs 
are constructed and analyzed using graph-theoretical concepts. Most previous studies investigated region-level 
graphs, which are computationally inexpensive, but bring along the problem of choosing sensible regions and involve 
blurring of more detailed information. In contrast, voxel-level graphs provide the finest granularity attainable from the 
data, enabling analyses at superior spatial resolution. They are, however, associated with considerable computational 
demands, which can render high-resolution analyses infeasible. In response, many existing studies investigating 
functional connectivity at the voxel-level reduced the computational burden by sacrificing spatial resolution. 

Methods: Here, a novel, time-efficient method for graph construction is presented that retains the original spatial 
resolution. Performance gains are instead achieved through data reduction in the temporal domain based on 
dichotomization of voxel time series combined with tetrachoric correlation estimation and efficient implementation. 

Results: By comparison with graph construction based on Pearson's r, the technique used by the majority of 
previous studies, we find that the novel approach produces highly similar results an order of magnitude faster. 

Conclusions: Its demonstrated performance makes the proposed approach a sensible and efficient alternative to 
customary practice. An open source software package containing the created programs is freely available for 
download. 
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Background 

The functioning of the human brain relies on the interplay 
and integration of numerous individual units in a com- 
plex network. Insights into its topology are thus essential 
to promote our understanding of the brain in general, 
as well as its maladaptive states associated with dysfunc- 
tion and disease. An increasingly popular approach to 
the analysis of functional brain networks is based on the 
framework of graph theory [1-4]. A graph is a mathemat- 
ical structure designed for modelling pairwise relation- 
ships, known as edges, between an assortment of objects, 
referred to as nodes. In applications to fMRI, the node 
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set is defined as a collection of brain sites, and edges are 
established by measuring internodal functional connec- 
tivity [5] based on the regions' associated time series. The 
obtained functional connectivity graph, serving as a sim- 
ple model of the brains functional organization in a com- 
plex network, is subsequently examined drawing on a rich 
collection of graph-theoretical metrics that target various 
aspects of its topology [6]. Several studies indicate, for 
instance, that the brains functional network conforms to 
a small-world architecture [1,2,7,8]. Beyond that, the use- 
fulness of graph-based functional connectivity analyses 
has been demonstrated in applications to brain develop- 
ment and aging [9-11], gender differences [12], intellectual 
performance [13], and neurological disorders, such as 
Alzheimer's disease [14,15] or schizophrenia [16,17]. 

In most previous studies, functional connectivity 
graphs have been constructed at the level of regions 
[1,7,10,14,18], meaning that graphical nodes are defined 
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based on a panellation of the brain into regions of 
interest (ROI), each consisting of several voxels. Due 
to the limited number of nodes, such analyses are 
computationally inexpensive and their results are compar- 
atively easy to visualize and interpret. However, region- 
level nodes involve mixing fMRI time series from the 
incorporated voxels, thus obliterating more detailed spa- 
tial information [4,19]. ROI-based analyses are therefore 
highly dependent on the quality of the panellation: If ROI 
boundaries and actual functional boundaries are inconsis- 
tent, the results can be erroneous [20] . Voxel-level analy- 
ses, in contrast, are not subject to these limitations, since 
the panellation inherent to the original data is used for 
node definition [8,11,13,15,21-23]. Consequently, voxel- 
level graphs provide a finer model of the brains functional 
network organization, since the original spatial resolution 
of the fMRI data is preserved [4,24]. 

Because of the large number of nodes, the construction 
and analysis of voxel-level graphs can involve consider- 
able computational efforts. In response, the computa- 
tional burden has often been reduced by sacrificing spatial 
resolution (using relatively large voxels to begin with or 
reslicing the data to a lower resolution) thus reducing the 
number of nodes in the graph [15,25-27]. While reduction 
of spatial resolution is undesirable in general (given that 
the main advantage of fMRI as compared to other meth- 
ods such as EEG/MEG is its superior spatial resolution), 
it can even render a study infeasible, e.g., when investigat- 
ing very small brain structures or different regions that lie 
in close proximity to each other. Efficient algorithms and 
implementations are therefore required in order to take 
full advantage of the data's original spatial resolution [24] . 

Here, we propose a novel approach aiming to increase 
the computational efficiency of voxel-level graph con- 
struction by combining time series dichotomization, 
tetrachoric correlation estimation, and efficient imple- 
mentation, while retaining the full spatial resolution of the 
data. Comparison with conventional graph construction 
(as carried out in previous studies) shows that the new 
approach not only produces highly similar results, but also 
executes an order of magnitude faster. 

Methods 

This section consists of three parts. We begin with a short 
introduction to voxel-level functional connectivity graphs 
and explain their construction from fMRI data. In par- 
ticular, it is established how time series dichotomization 
can be combined with tetrachoric correlation estimation 
to efficiently measure functional connectivity. The sec- 
ond part describes analyses comparing Pearsons r and the 
tetrachoric correlation coefficient r t (1) as correlation esti- 
mators in the controllable environment of synthetic data, 
(2) as measures of functional connectivity in the context of 
graph construction from fMRI data, and (3) with respect 



to the similarity of graphs resulting from (2). The last part 
provides implementation details regarding the programs 
created for this study and assesses their computational 
performance. 

Voxel-level graph construction 

Formally, an undirected binary graph is defined as an 
ordered pair Gb = (N,E), comprised of a set of nodes N 
and a set of pairwise internodal connections, or edges, 
£. Individual edges are unordered pairs {/,;}, where i,j e 
N. Gb can be represented by a binary adjacency matrix 

B \N\x\N\ = {bij)i where bij e {a 1}> ij e K and bij = X 

indicates that {/,/} e E, i.e., that a connection between the 
two nodes i and j exists. In order to represent not only the 
presence or absence of connections but also their strength, 
a graph can be extended by assigning a weight to each 
edge. 

In applications to fMRI data, graph-based analyses rely 
on the derivation of a graphical representation of the 
brains functional network, which is then examined in 
terms of graph theory (Figure 1). Voxel-level functional 
connectivity graphs are constructed based on individual 
voxels as nodes, that is, the set of nodes N is a collec- 
tion of voxels. In the literature, N is often defined as all 
in-brain voxels, or all gray matter (GM) voxels. Functional 
connectivity is estimated between all pairs of nodes based 
on their corresponding time series using a measure of 
association. 

To construct a binary graph, edges are established by 
thresholding the functional connectivity estimates: Two 
nodes are connected by an edge if their functional con- 
nectivity exceeds a given threshold. Alternatively, one can 
take the pairwise functional connectivity matrix as a basis 
for a weighted graph, thus conserving the strength of indi- 
vidual functional connections between nodes [28]. For 
simplicity, and in a manner consistent with the majority 
of previous work on voxel-level functional connectiv- 
ity graphs, we presently focus on binary graphs for our 
analysis. 

Measuring functional connectivity 

In most previous studies investigating voxel-level func- 
tional connectivity graphs, internodal functional con- 
nectivity is measured using Pearsons sample correlation 
coefficient r [2,8,13,15,21-27,29-31]. When using Pear- 
son correlation as a measure of functional connectivity, it 
seems sensible to assume bivariate normality with respect 
to the distribution of pairwise observations arising from 
each pair of voxel time series. This is because Pearson 
correlation may be a poor measure of association if the 
data are not normally distributed [32]. Encouragingly, in 
a recent study employing region-level graphs, data for the 
most part appeared to meet the assumption of bivariate 
normality [33]. 
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Figure 1 Construction and analysis of voxel-level functional connectivity graphs. Starting with the preprocessed fMRI data, all gray matter 
voxels are defined as graphical nodes (1). Using their associated time series, pairwise internodal functional connectivity is measured in terms of 
linear correlation. Typically, this is done using Pearson's r. Alternatively, one can derive binary time series via median-based dichotomization and 
employ tetrachoric correlation estimation (r t ). In both cases, the result is a correlation matrix (2) representing the pairwise functional connectivity 
between nodes. A binary undirected graph, represented by a binary adjacency matrix (3), is obtained via thresholding. Based on the adjacency 
matrix, graph-theoretical metrics, such as the node degree k, are computed (4). 



Assuming bivariate normality between pairs of voxels, 
an alternative correlation estimator, the tetrachoric corre- 
lation coefficient r t [34], can be used instead of r. Given 
two dichotomous variables Xd and y^, r t estimates the 
correlation of the latent continuous-valued variables x c 
and y c associated with x^ and yj t under the assump- 
tion that x c and y c follow a bivariate normal distribution. 
Thus, if we dichotomize, i.e., binarize, the voxel time series 
data, r t can be used to estimate the pairwise correlation 
of the original continuous-valued time series from the 
dichotomized ones. 

Consider two voxels, v and w, and their corresponding 
time series, s v and s w . Using the medians of s v and s wt i.e., 
s v and s w , as dichotomization thresholds, we obtain the 
binary time series d v and d w . Formally, d v ^ = 1 if the sig- 
nal intensity value s v> £ amounts at least to s v and d v ^ = 
0 otherwise, where k e {1, . . . , T} and T is the num- 
ber of acquired fMRI volumes [35]. See Section S.l for 
details. 

By virtue of s v and s wt the pairs (s v j a s W) fc) are divided 
into four partitions corresponding to four quadrants in 
the #-y-plane of a Cartesian coordinate system (Figure 2A, 
bottom). Thus, by counting the number of points falling 
into each quadrant, a pair of voxels gives rise to a 2 x 2 
contingency table, the (relative) frequencies of which can 
be expressed in terms of d v and d w . For example, n\\, the 
frequency of points in time where s v> £ and s W) k amount 
at least to s v and s w , respectively, is given by n\\ = 
Ya=i dy,k ' dw,k- m °ther words, n\\ is the number of 
points where both d v and d w are equal to 1, yielding the 
associated relative frequency p\\ through p\\ = 

The probability masses corresponding to the tables rela- 
tive frequencies are equal to the respective partial volumes 
belonging to the four quadrants in the #-j/-plane under 
the curve representing the bivariate normal distribution 



(Figure 2A). The correlation coefficient r t , for which these 
partial volumes resemble the relative frequencies in a 
given table, is an estimate of the population correlation p 
belonging to the underlying distribution. Since a 2 x 
2 contingency table is uniquely defined by the marginal 
probabilities and one joint probability, r t can be found 
by solving, e.g., p n = f®-i(p 9o) f^-i (p09) f(^z y )dz x dz y) 
where O is the standard normal distribution function, 
O -1 is its inverse, and / is the probability density func- 
tion of the bivariate normal distribution. While this 
would typically be solved using numerical techniques, 
an analytical solution, r t = — cos(27r^n), exists for the 
case under consideration (Figure 2B). See Section S.2 for 
details. 

Given a pair of voxels, we can determine p\\ from 
the dichotomized time series and use the relationship 
r t = — cos(27Tj?ii) in order to obtain r^. As a consequence, 
r t can be used instead of r to estimate pairwise functional 
connectivity in the process of graph construction from 
fMRI data (Figure 1). 

Simulations 

Building upon the theoretical considerations presented 
above, we analyzed the characteristics of r t in the control- 
lable environment of synthetic data. More specifically, we 
assessed its usefulness relative to r in estimating the cor- 
relation parameter p of bivariate normal populations of 
known properties. 

Data 

Bivariate normal populations were generated, such that 
each of them exhibited a predefined population correla- 
tion p, where p e {-0.99, -0.98, . . . , 0, . . . , 0.98, 0.99}. 
Then, 10000 bivariate samples of size T (one sample rep- 
resents a pair of time series of length T) were randomly 
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Figure 2 Dichotomization and tetrachoric correlation estimation. Consider a sample ((xi,yi), (xi^i), (xt.Yt)) of size T where (x,y) is 
distributed according to bivariate normality. Further, let x = X],x 2l . . . ,x T and y = y ]l y 2l ... ,y T denote the samples from x and y, respectively. Using 
# and y, as thresholds, * and y can be dichotomized resulting in the binary samples Xd and y d . A: As an example, the density of a bivariate normal 
distribution (p = 0.7) is shown (top, 3D curve) along with a sample (bottom, points in thex-y-plane) drawn from that distribution. By virtue of the 
two linesx = * andy = y, thex-y-plane is divided into four quadrants, such that the counts of sample points per quadrant form a 2 x 2 contingency 
table. The (relative) frequencies in the contingency table can also be expressed in terms of Xd and y d (e.g., ri] 1 is the number of indices where both 
Xd and y d are equal to 1 , and p\ 1 = ri] 1 T~ ] ). The probability masses corresponding to the table's relative frequencies are equal to the respective 
partial volumes belonging to the four quadrants in the x-y-plane under the bivariate normal's curve. The tetrachoric correlation coefficient r tl for 
which these partial volumes resemble the relative frequencies in a given table, is an estimate of the population correlation parameter p belonging 
to the underlying distribution. B: Relationship between p\ 1 and r t . Given Xd and y d , r t can be found using r t = — cos(27ipi 1 ). For details see text. 



drawn from each of the populations. For each sample, r 
and r t were calculated. Prior to calculating the latter, the 
data were dichotomized as described above. The entire 
procedure was conducted separately for two different 
sample sizes, T = 100 and T = 300, resulting in two data 
sets, where the choice of these numbers was guided by the 
parameters of the real fMRI data we analyzed. 

Correlation estimation 

For each data set and estimator p e {r,r t }, joint his- 
tograms (p,p) with associated marginal histograms were 
calculated. For the joint histograms a linearly spaced 199 x 
199 grid was used, such that bin centers in both dimen- 
sions corresponded to correlations exhibited by the gener- 
ated bivariate populations. For each estimator, means and 
standard deviations, as well as mean signed differences 
MSD(/5, p) were calculated per p-bin. Mean signed dif- 
ferences are defined as MSD(/5,p) = n~ l Y^(Pi — p)> 
where n is the number of samples per p-bin, i.e., n = 
10000, and pi is the correlation estimate for sample L 
Since p is not known in the case of real data, additional 
joint histograms (r t , r) were calculated in order to facilitate 
comparability with respect to real data applications. As 
both estimators exhibit errors with respect to p, Deming 
regression a was used in order to fit a linear relationship to 
the (rt, r) data. 



Data 

MRI data were obtained from the "1000 Functional 
Connectomes Project" repository [36,37]: We used the 
"Cambridge" and "Pittsburgh" data sets, contributed by 
R.L. Buckner and G. Siegle, respectively. These data sets 
contain resting-state fMRI data from 198 subjects (75 
males/ 123 females, ages: 18-30 years; imaging parame- 
ters: TR = 3s, voxel size = 3x3x3 mm 3 , number of 
slices = 47; number of volumes = 119) and 17 subjects 
(10 males/7 females, ages: 25-54; imaging parameters: TR 
= 1.5s, voxel size = 3.125 x 3.125 x 3.2 mm 3 , number of 
slices = 29; number of volumes = 275), respectively. Both 
data sets also include anatomical scans for each subject. 

Preprocessing 

Using SPM8 [38] functional data were motion-corrected 
by alignment to the mean functional image, then anatom- 
ical scans were coregistered to the mean functional image 
and segmented. In order to account for low frequency 
intensity drifts and high frequency noise, frequencies 
below 0.01Hz and above 0.1Hz were removed from the 
voxels time series by band-pass filtering, as is custom- 
ary for resting-state data [39]. In order to minimize the 
impact of preprocessing on the data's correlation struc- 
ture, we refrained from spatial smoothing and spatial 
normalization [8,27]. 



Application to fMRI data 

Comparative graph-based analyses of resting-state fMRI 
data were carried out based on r vs. r t as measures of 
functional connectivity. 



Correlation estimation 

Based on r and r t as a measure of functional connec- 
tivity, two voxel-level graphs were constructed for each 
subject from the two data sets. Nodes were defined 
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as supra-threshold voxels in the subject-specific GM 
probability maps obtained from the segmentation (thresh- 
old Oqm = 0.2). To measure internodal functional con- 
nectivity, two correlation matrices, C r and C n , were 
calculated based on all pairwise correlations between 
nodes. C r was obtained by calculating Pearson correla- 
tions based on the voxels' associated continuous-valued 
time series from the preprocessed fMRI data, and C n was 
obtained analogously, except that tetrachoric correla- 
tions were calculated instead of Pearson correlations, and 
binary voxel time series were used instead of continuous- 
valued ones. Again, binary voxel time series were derived 
from the continuous-valued ones through median-based 
dichotomization. In order to compare C r and C n , their 
entries were used to calculate joint histograms (r t , r) in the 
same fashion as for the synthetic data. 

Functional connectivity graphs 

Subject- specific binary functional connectivity graphs, B r 
and B n , were derived from C r and C n , respectively, via 
density-based thresholding: The density k of a binary 
undirected graph B is the proportion of potential edges 
that actually exist, i.e., k = \ N \ 2 qn\-i) - m or der to facilitate 
comparability across graphs, an individual correlation 
threshold 0 was determined for each correlation matrix, 
such that the resulting binary graphs exhibited the same 
density k. Given C, where C e {C r , C r J, and 6, the entries 
of B are given by by = 1, if c,y > 0, and by = 0 otherwise, 
where 1 < /,/ < \N\. 

Node degree maps 

In graph-based fMRI functional connectivity analyses, 
one of the most popular graph-theoretical metrics is 
the node degree, or degree centrality, a measure aiming 
to characterize the importance of individual nodes in a 
binary graph. Given a binary graph B, the degree k{ of a 
node i is defined as the number of nodes that are con- 
nected to i via an edge, or, more formally, Iq = Hj> 
where /,/ e N and i ^ j [40]. The node degree has recently 
been employed in several neuroimaging studies aiming to 
identify potential hub regions in the human brain [27,31]. 

Here, node degrees k were calculated for all subject- 
specific functional connectivity graphs B r and B rr 
Degrees were standardized in order to afford compara- 
ble scaling across subjects [15]. The spatial distribution 
of degrees was analyzed by constructing /c-maps in indi- 
vidual brain space for each subject. In order to gener- 
ate group average /c-maps for each data set (Cambridge 
and Pittsburgh), the subject-specific /c-maps were spatially 
normalized to ICBM b -template space based on transfor- 
mation parameters estimated with respect to the mean 
functional image using SPM8 [41-43]. Since the normal- 
ized /c-maps have different overlaps due to anatomical 
differences and differing GM masks, a varying number 



of subjects "supports" each standard space voxel. Thus, 
group-level /c-maps were derived by voxel-wise averag- 
ing of the /c- values from the supporting subjects. For 
enhanced reliability, /c-values of voxels supported by less 
than 20% of all subjects were discarded. 

Implementations 

The most time-consuming step when constructing a 
graph from fMRI data consists in the computation of a 
functional connectivity matrix, which here corresponds 
to the computation of a correlation matrix based on r or 
r t . In the following, the programs created for calculating 
the voxel-level pairwise correlation matrices C r and C n 
will be referred to by pec and tet race, respectively. For 
both programs we opted to store only the upper triangu- 
lar part of C, in order to save memory. In doing so, no 
information is lost, since C is symmetric. Because efficient 
implementation plays an important role when aiming to 
accelerate large-scale analyses, implementation was con- 
ducted using the C programming language, providing 
Matlab integration via its C interface MEX. A Matlab 
toolbox and C sources are available for download [44]. 

Calculation ofC r 

Pearsons sample correlation coefficient r is calculated for 
a pair of voxels v and w using 



To avoid redundant operations, subexpressions depend- 
ing on one voxel only are precalculated for all voxels before 
computing C r . 

In order to take advantage of the processors cache 
without the need for explicit knowledge about its size, 
we adopted a so-called cache- oblivious algorithm [45,46] 
to compute the correlation matrix, rather than explicit 
blocking (with a predetermined block size that optimally 
fits the cache). The core idea is to recursively divide the 
problem so that the computations are carried out on 
smaller and smaller blocks of data. Given that the mini- 
mum block size is small enough, there is a division step 
from which on all computations use only data that fits into 
the processor cache (regardless of its size), thus making 
optimal use of the cache by localizing the computations. 
The division scheme we implemented is illustrated in 
Additional file 1: Figure SI, which shows the first three 
steps of dividing the upper triangle of the correlation 
matrix. 

In addition, we exploited SSE2 (Streaming SIMD Exten- 
sions version 2, where SIMD stands for Single Instruction 
Multiple Data) and AVX (Advanced Vector extensions) 
instructions (on processors that support them c ), which 
allow for parallelization on a single core by carrying out 
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the same operation on multiple data elements in paral- 
lel (also known as vectorized computations). Using SSE2 
(AVX), the computation of the numerator of the corre- 
lation coefficient can be parallelized by computing four 
(eight) sums in parallel (if the float data type is used; 
for double, two and four sums, respectively, can be 
computed in parallel). The procedure is illustrated in 
Additional file 1: Figure S2 for SSE2 using float or AVX 
using double (four sums in parallel). 

Calculation ofC rt 

For each pair of voxels, r t is computed in three steps. First, 
the bitwise AND operator is applied to the voxels' associ- 
ated binary time series. Second, the set bits in the result 
are counted to obtain n\\. Third, r t is retrieved from a 
lookup table of the function r t = — cos(27tnnT~ 1 ). The 
table is indexed by n\\ and contains the corresponding 
r t values for those values that n\\ can attain. Depend- 
ing on T being even or odd, these are 0, 1, 2, . . . , ^ or 
1, 2, 3, ... , respectively. 

Storing the binary time series in integers of, e.g., 32 bit, 
32 points in time can be processed in parallel, so that the 
above three steps need to be executed only [7/32] times 
per pair. Hence, it seems conceivable that the computa- 
tional cost in terms of CPU time could be lower for the 
calculation of C n than for the calculation of C r . 

Following the procedure outlined above, two pro- 
grams, tetracc/32 and tetracc/128 were cre- 
ated. tetracc/32 uses 32 bit integers for storing 
binary time series and a 16 bit lookup table for 

bit-counting, tetracc/128 uses mml2 8i variables 

(holding 128 bit each) for time series storage. Using 
the intrinsics _mm_and_sil2 8 and _mm_popcnt_u64 
for bitwise AND and bit-counting, respectively, it is 
expected to improve over tetracc/32, since more 
data can be processed in parallel and no extra mem- 
ory access (lookup table) is needed. While tetracc/32 
is platform-independent, tetracc/128 is only applica- 
ble on fairly modern CPUs, because _mm_and_sil2 8 
and _mm_popcnt_u64 depend on the availability of 
SSE2 and POPCNT instructions, respectively^ 

Parallel versions 

For additional performance gains, parallel versions of pec 
and tetracc have been implemented using multiple 
threads. This aspect is, however, beyond the focus of 
this article, since the resulting benefits relative to single- 
threaded programs are expected to be fairly independent 
of the choice of r versus r t as a measure of internodal 
functional connectivity. 

Performance tests 

In order to assess the performance of the programs 
described above, we compared them to three other 



programs: Matlabs built-in function corrcoef, corr 
from Matlabs Statistics Toolbox, and IPN_f as t Corr, 
a function from the Matlab toolbox ipnvoxelgraph by 
X.N. Zuo. Experiments were conducted from within Mat- 
lab (R2011b) on a desktop computer with an Intel(R) 
Core(TM) 17-3960X CPU (3.30GHz) and 64GB main 
memory running Linux (Kernel 3.4). The C/MEX routines 
that are part of our programs were compiled using the 
GNU C compiler gec (version 4.7.1, optimization level 3). 
In order to prevent programs from making use of multiple 
cores, Matlab was restricted to one CPU core. 

Input data sets (S VxT ) were generated using pseudo- 
random numbers of type float. While the length of time 
series T was fixed at T = 200, the number of voxels V was 
varied between 10000 and 170000 in steps of 10000. The 
maximum number of voxels, 170000, follows from the fact 
that storing the resulting matrix (upper triangular part) 
requires 53.83 GB { V(y ~ l) floats, 4 bytes per float). 
Since the machine used has 64GB of main memory, this 
seemed a sensible choice in order to leave some memory 
for other applications and subsequent processing of the 
matrix. Because corrcoef, corr, and IPN_f as t Corr 
return the complete symmetric matrix, they were only 
tested using input data sets with up to 120000 voxels cor- 
responding to 53.64 GB of memory required to hold the 
matrix (V 2 floats). 

Results 

Correlation estimation 

Overall, the correlation estimation from dichotomized 
data using r t yielded results strongly resembling those 
obtained through estimation from continuous data 
using r. For synthetic data, as expected, r and r t exhibited 
linear relationships to p and also to each other (Figure 3A). 
Standard deviations (with respect to means per p-bin) 
were greater for r t than for r, but were reasonably small 
for both. Peaking at p = 0 with 0.101 and 0.058 (r), and 
0.158 and 0.09 fa), for T = 100 and T = 300, respec- 
tively, they exhibited a gradual decrease towards the range 
limits of p. Naturally, deviations from p were larger for 
T = 100 than for T = 300, because for T = 100 each 
calculated correlation is based on fewer values than for 
T = 300. Close inspection of the mean signed differences 
MSDfa, p) and MSD(r, p) revealed a small systematic 
bias of both r and r t as estimators of p (Additional file 1: 
Figure S3). The expected value of r, £(r), can be approx- 
imated by E(r) = p - pO^) [47,48]. E(r) - p closely 
matched the empirical results from the simulation repre- 
sented by MSDfa p), while MSDfa, p) follows a curve of 
similar shape but larger amplitude. The Pearson correla- 
tion between r and p, r t and p, and r t and r, amounted to 
0.992 (0.997), 0.978 (0.992), 0.986 (0.995), respectively, for 
T=100(T=300). 
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For fMRI data, r t and r followed a linear relationship in 
both data sets, although there was a slight counterclock- 
wise rotation about the origin as reflected in a slope > 1 as 
opposed to 1 for a perfect relation r t = r (Figure 3B). This 
suggests only limited deviation from the assumption of 
pairwise bivariate normality, and, moreover, indicates that 
r^-based graphs closely resemble r-based graphs. Further- 
more, the results for the Cambridge data set (T = 119) 
showed a greater variance than those for the Pittsburgh 
data set (T = 275). Since this feature was also observed 
in the results for the synthetic data sets, we presume that 
this was caused by the smaller sample size for the calcu- 
lation of each sample correlation. The overall correlation 
between r and r t amounted to 0.82 (Cambridge) and 0.85 
(Pittsburgh). 

Note that the vertical gaps in bin occupation in those 
histograms involving r t are due to the fact that r t can 
attain a distinct set of values only, as explained earlier. 
The number of attainable values, and hence the (potential) 
performance of r t as an estimator, increases with T. 

Node degree 

Group average node degree maps from r- and r^-based 
binary graphs B r and B n (derived from C r and C w 
respectively, using a density threshold of k = 0.01) 
are presented in Figure 4. Other thresholds (k g 
{0.05, 0.1}) led to similar results and are hence not shown. 
In accordance with the strong correlation between r 
and r t reported in the previous section, both approaches 
yielded highly similar spatially distributed node degree 
maps (Figure 4A). In line with this, k r and k n were very 
strongly correlated (r(k r ,k rt ) = 0.95 for Cambridge and 
r(k r ,k rt ) = 0.97 for Pittsburgh; Figure 4B), although 
degrees tended to be slightly higher for r t -based than for 
r-based graphs. Prominent clusters of high-degree nodes 
were found within circumscribed regions of the occipi- 
tal (cuneus, precuneus, fusiform and lingual gyri), parietal 
(intraparietal sulcus, superior parietal lobe, temporopari- 
etal junction), temporal (superior temporal gyrus, tempo- 
ral pole, amygdala) and frontal lobes (medial orbitofrontal 
and rostral ventromedial prefrontal cortex) with a sim- 
ilar distribution pattern as reported in previous work 
employing r-based node degree mapping [15,27,31]. 

Performance tests 

The most time-consuming step when constructing a 
graph from fMRI data consists in the computation of a 
functional connectivity matrix. We compared the perfor- 
mance of our programs on this task to three other pro- 
grams: Matlabs built-in function corrcoef, corr from 
Matlabs Statistics Toolbox, and IPN_f as t Corr, a func- 
tion from the Matlab toolbox ipnvoxelgraph by X.N. Zuo. 
Table 1 shows memory requirements, execution times, 
and speedups relative to corrcoef which was selected as 



reference since it is available to any Matlab user out of the 
box and we therefore assume that it has a higher preva- 
lence than corr or IPN_f as t Corr. Figure 5 illustrates 
the performance in terms of data troughput measured in 
correlation coefficients per second. This measure does not 
depend on the performance of a reference program and 
offers more immediate access to the key results. In this 
sense, it is complementary to Table 1. 

In line with expectations, execution times increased 
quadratically with the number of nodes for all pro- 
grams (Table 1). While pecs basic variant (pec /naive) 
was considerably slower than corrcoef (speedup 
0.31 x), its SSE2- and AVX-based variants achieved 
speedups around 1.34 x and 2.08 x, respectively. The 
performance of corr (speedup 1.35 x) was comparable 
to that of pcc/SSE2, while IPN_f astCorr (speedup 
1.63x) ranked between pcc/SSE2 and pcc/AVX. The 
tetracc variants (3 2 and 12 8) were considerably faster 
than all programs computing r with speedups (relative to 
corrcoef) around 5.7 x and 13.5 x, respectively. 

As an aside, we note that IPN_f as t Corr as well as 
pec and tetracc scaled better with the number of cores 
than corrcoef and corr. For example, using 6 cores 
and a data set of 50000 nodes (T = 200), the speedups 
were 1.16x (corr), 2.46x (lPN_f astCorr), 2.32x 
(pcc/SSE2), 3.42x (pcc/AVX), 9.44x (tetracc/32), 
and 20.22x (tetracc/12 8), compared to corrcoef s 
execution time of 18.5 seconds. Using the same data 
set but only 1 core the respective speedups were 
1.35x, 1.63x, 1.33x, 2.08x, 5.7x, and 13.5x as com- 
pared to corrcoefs execution time of 55.9 seconds 
(Table 1). Thus, the speedup gained by using 6 cores 
instead of 1 amounts to 4.56x (I PN_f astCorr), 5.29x 
(pcc/SSE2), 4.98x (pcc/AVX), 4.98x (tetracc/32), 
and 4.49x (tetracc/12 8), while corr and corrcoef 
gain only 2.6 x and 3 x , respectively. 

Note that pec and tetracc require only half the 
amount of memory (column 8) that corrcoef, corr, 
and I PN_f astCorr require (column 2), because they 
store only half of the symmetric matrix for memory effi- 
ciency (Table 1). In addition, corrcoef, corr, and 
I PN_f astCorr failed with an out-of-memory error for 
input data sets with 90000 or more nodes. We assume 
that these programs internally use more memory than 
we expected, since the resulting matrix of correlation 
coefficients would require less than half of the available 
memory (90000 2 • 4Byte = 30.17GB). Hence, the speedups 
of the remaining programs compared to corrcoef could 
not be computed for T > 90000. 

Discussion 

Graph-based functional connectivity analysis at the level 
of individual voxels allows for spatially fine-grained char- 
acterization of functional networks in the human brain. 
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Figure 4 Comparison of node degrees k from r- {k r ) and r r based {k rt ) binary graphs. Subject-specific graphs were derived from the 
correlation matrices C r and C h using a density threshold of k = 0.01 corresponding to correlation thresholds of r = 0.48 ± 0.06 and r t = 0.53 ± 0.04 
(Cambridge) and r = 0.46 ± 0.08 and r t = 0.49 ± 0.07 (Pittsburgh), respectively. Individual degree maps were spatially transformed to MNI space to 
derive group average maps. Note that the degrees were standardized for each subject before averaging, resulting in ranges that one would not 
commonly expect for degrees. For details see text. A Group average degrees on top of axial slices of the MNI brain (shown in neurological 
convention). Top row: k r . Bottom row: k h . B Joint distribution of k r and k rr JUe composite plots consist of a joint histogram and the corresponding 
marginal histograms. Joint probabilities were mapped to grayscale intensities. 



However, with high-resolution data sets, such analy- 
ses can become infeasible due to the computational 
demands involved. Most previous studies investigating 
voxel-level functional connectivity graphs relied on Pear- 
sons r for estimating internodal functional connectiv- 
ity [2,8,13,15,21-27,29-31]. As demonstrated here, the 
tetrachoric correlation coefficient r t constitutes a time- 
efficient alternative to r as a measure of functional con- 
nectivity. 

In order to reduce the computational costs associated 
with the analysis of voxel-level graphs, previous studies 
reduced the data's spatial resolution [15,26,27], spatially 
restricted the graphical edges incorporated into the anal- 
ysis [21], or utilized parallel computing [31]. In contrast, 
efficiency benefits from 77-based graph construction are 
achieved without sacrificing spatial resolution, disregard- 
ing graphical edges, or exploiting parallel computing. An 
open source software package containing the created pro- 
grams is freely available for download [44], Note that 
parallel versions of r- and r^-based graph construction 
have been implemented in addition to the sequential ones, 



thus providing additional efficiency gains that depend on 
the number of available processors. While this aspect is 
not the main focus of this article, as the resulting benefits 
(relative to sequential implementations) can be expected 
to be fairly independent of the choice of r versus r t as 
a measure of internodal functional connectivity, the par- 
allel implementations are still included in the software 
package [44]. 

Even though the dichotomization procedure (a prereq- 
uisite to the computation of r t ) entails discarding infor- 
mation in the time domain, important characteristics of 
the original data appear to be preserved. In applications 
to artificially generated as well as real fMRI data the new 
technique proved capable of closely reproducing results 
obtained in conventional ways. More specifically, the use- 
fulness of the r^-based approach was assessed by compar- 
ison with r in estimating the correlation parameter p of 
bivariate normal populations of known properties. In this 
setting, both the bias and standard deviation were greater 
for r t than for r, but still reasonably small. Thus, r^-based 
correlation estimation yielded results closely resembling 
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Results are averages from 10 runs on a desktop computer with an Intel(R) Core(TM) i7-3960XCPU (3.3GHz) and 64GB main memory. All programs were restricted to one CPU core. Length of time series T was fixed at T = 200. 
\N\: number of nodes. The programs corrcoef (Matlab built-in), corr (Matlab Statistics Toolbox), lPN_f astCorr (X.N. Zuo), and pec (three variants) computed C r , while tetracc (two variants) computed Q r . m [GB]: 
memory requirements for result in GB; t [s]: elapsed time in seconds; s [x]: speedup relative to corrcoef. 
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Figure 5 Performance comparison for computation of correlation matrices. Results are averages from 1 0 runs on a desktop computer with an 
Intel(R) Core(TM) i7-3960X CPU (3.3GHz) and 64GB main memory. All programs were restricted to one CPU core. Length of time series T was fixed at 
T = 200. \N\: number of nodes. The programs corrcoef (Matlab built-in), corr (Matlab Statistics Toolbox), lPN_f astCorr (X.N.Zuo), and pec 
(three variants) computed Q, while tetracc (two variants) computed Q r The performance of each program is given as the number of correlation 
coefficients computed per second. 



those obtained when using r. Beyond that, r- and /7-based 
graph construction and node degree computation were 
carried out for real fMRI data. A strong linear relationship 
was found between r- and r^based correlations indicat- 
ing that r^based graphs closely resemble r-based graphs, 
since the graphs are derived from the correlation matrices. 
In line with this, the spatial distribution of node degrees 
was highly similar for r- and r t - based graphs and also in 
good correspondence with previous work [15,27,31]. 

As data mining approaches are currently gaining 
momentum in the neuroimaging community [36,37,49,50], 
the amount of publicly available experimental data is 
steadily growing. Consequently, development and imple- 
mentation of efficient exploratory methods, such as the 
one presented here, are necessary in order to take full 
advantage of this wealth of data, especially with respect 
to connectivity analyses [51]. Fast construction and sub- 
sequent analysis of graphs may thus open new avenues 
for applications, including those within a clinical set- 
ting, where the voxel-level approach may be of particular 
importance. It is worth noting in this context that voxel- 
level graph construction can operate at the original data 
resolution, thus avoiding the reduction of the analysis' 
spatial sensitivity [4,24] . For example, disease-related pat- 
terns, once identified, may serve as connectivity-based 
biomarkers that could aid, guide, or facilitate diagnostics 
and may increase prediction accuracy with respect to 
disease occurrence, recurrence, severity, or treatment 



outcome. Here, again, efficient methods are essential to 
facilitate assessment of individual patients within a nar- 
row time frame [52]. If combined with efficient tools 
for subsequent analysis, the presented methods for fast 
graph construction may also be useful for online evalua- 
tion of functional connectivity in the context of real-time 
fMRI. This would allow for connectivity-based adapta- 
tion of experimental stimulation and interaction with 
the subject, for example, in task-based fMRI studies, or 
neurofeedback-based training. Taken together, we believe 
that there is a multitude of applications (be them exper- 
imental or clinical) that could benefit from the methods 
presented here, highlighting the growing importance of 
efficient tools for graph-based analysis of voxel-level con- 
nectivity. 

Limitations 

As illustrated by the results, the accuracy of a correla- 
tion estimate naturally increases with the number of data 
points, i.e., the number of scans. Along the same lines, 
it has recently been shown that the reliability of func- 
tional homogeneity increases with scan duration [53]. For 
both correlation estimators, it is therefore recommended 
to avoid a low number of scans (caused, for example, by a 
short scan duration, or a long TR, or both). Since the devi- 
ation from the population correlation p is generally higher 
for r t than for r, a low number of scans will affect r t more 
severely than r. 
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The main focus of this work lies with the compari- 
son of r and r t as functional connectivity estimators. To 
reduce the impact of preprocessing on the data's cor- 
relation structure prior to this comparison, we limited 
the preprocessing of the fMRI data to a minimum. The 
effect of additional preprocessing steps, or a different pre- 
processing pipeline altogether, on the robustness of the 
proposed methods should be subject of future research. 
Unpublished results from our group indicate, however, 
that the comparability of r and r t remains essentially 
consistent. 

Conclusions 

Voxel-level graphs allow for spatially fine-grained anal- 
yses of functional connectivity networks. In order to 
reduce the considerable computational demands involved, 
many previous studies reduced the spatial resolution of 
the data. Here, a new method for graph construction — 
exploiting time series dichotomization and tetrachoric 
correlation estimation— was devised, efficiently imple- 
mented, and compared to the conventional approach 
based on continuous-valued data and Pearsons r. In appli- 
cations to artificially generated as well as real fMRI data 
the new technique proved capable of producing highly 
similar results. Through efficient bit-based implementa- 
tion adapted to the dichotomized data the novel method 
runs an order of magnitude faster while the original spa- 
tial resolution of the data is retained. Hence, its demon- 
strated performance, not only in producing consistent 
results, but in obtaining them substantially faster, makes 
the new approach a sensible and economical alternative 
to customary practice. An open source software pack- 
age containing the created programs is freely available for 
download [44]. 

Endnotes 

a Deming regression is a linear regression method that 
accounts for errors in both variables. 

b International Consortium for Brain Mapping. 

C SSE2 was introduced by Intel with the Pentium 4. It is 
also supported by AMD CPUs starting with the 
Athlon 64. AVX is supported starting with the Sandy 
Bridge (Intel) and Bulldozer (AMD) microarchitectures. 

d POPCNT became available starting with the 
Nehalem (Intel) and Barcelona (AMD) 
microarchitectures. 
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